Human modeling and relighting are two fundamental problems in computer vision and graphics, where high-quality datasets can largely facilitate related research. However, most existing human datasets only provide multi-view human images captured under the same illumination. Although valuable for modeling tasks, they are not readily used in relighting problems. To promote research in both fields, in this paper, we present UltraStage, a new 3D human dataset that contains more than 2K high-quality human assets captured under both multi-view and multi-illumination settings. Specifically, for each example, we provide 32 surrounding views illuminated with one white light and two gradient illuminations. In addition to regular multi-view images, gradient illuminations help recover detailed surface normal and spatially-varying material maps, enabling various relighting applications. Inspired by recent advances in neural representation, we further interpret each example into a neural human asset which allows novel view synthesis under arbitrary lighting conditions. We show our neural human assets can achieve extremely high capture performance and are capable of representing fine details such as facial wrinkles and cloth folds. We also validate UltraStage in single image relighting tasks, training neural networks with virtual relighted data from neural assets and demonstrating realistic rendering improvements over prior arts. UltraStage will be publicly available to the community to stimulate significant future developments in various human modeling and rendering tasks.
translated by 谷歌翻译
Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men, and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers the assessment accuracy when computer technology is used to aid in diagnosis. Methods: This present study provided a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, the experimental results for EBHI-Seg are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results showed that deep learning methods had a better image segmentation performance when utilizing EBHI-Seg. The maximum accuracy of the Dice evaluation metric for the classical machine learning method is 0.948, while the Dice evaluation metric for the deep learning method is 0.965. Conclusion: This publicly available dataset contained 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can provide researchers with new segmentation algorithms for medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.
translated by 谷歌翻译
Recent years we have witnessed rapid development in NeRF-based image rendering due to its high quality. However, point clouds rendering is somehow less explored. Compared to NeRF-based rendering which suffers from dense spatial sampling, point clouds rendering is naturally less computation intensive, which enables its deployment in mobile computing device. In this work, we focus on boosting the image quality of point clouds rendering with a compact model design. We first analyze the adaption of the volume rendering formulation on point clouds. Based on the analysis, we simplify the NeRF representation to a spatial mapping function which only requires single evaluation per pixel. Further, motivated by ray marching, we rectify the the noisy raw point clouds to the estimated intersection between rays and surfaces as queried coordinates, which could avoid \textit{spatial frequency collapse} and neighbor point disturbance. Composed of rasterization, spatial mapping and the refinement stages, our method achieves the state-of-the-art performance on point clouds rendering, outperforming prior works by notable margins, with a smaller model size. We obtain a PSNR of 31.74 on NeRF-Synthetic, 25.88 on ScanNet and 30.81 on DTU. Code and data are publicly available at https://github.com/seanywang0408/RadianceMapping.
translated by 谷歌翻译
电线杆和建筑物边缘经常是城市道路上可观察到的对象,为各种计算机视觉任务提供了可靠的提示。为了重复提取它们作为特征并在离散激光镜头框架之间进行注册,我们提出了第一个基于学习的功能分割和LIDAR点云中3D线的描述模型。为了训练我们的模型,而无需耗时和乏味的数据标记过程,我们首先生成了目标线基本外观的合成原始图,并构建一个迭代线自动标记的过程,以逐步完善真实激光扫描的线路标签。我们的分割模型可以在任意规模的扰动下提取线,我们使用共享的EDGECONV编码层共同训练两个分割和描述符头。基于模型,我们可以在没有初始转换提示的情况下构建一个高度可用的全局注册模块,用于点云注册。实验表明,我们基于线的注册方法对基于最先进的方法的方法具有很高的竞争力。我们的代码可在https://github.com/zxrzju/superline3d.git上找到。
translated by 谷歌翻译
基于步态阶段的控制是步行AID机器人的热门研究主题,尤其是机器人下限假体。步态阶段估计是基于步态阶段控制的挑战。先前的研究使用了人类大腿角的整合或差异来估计步态阶段,但是累积的测量误差和噪声可能会影响估计结果。在本文中,提出了一种更健壮的步态相估计方法,使用各种运动模式的分段单调步态相位大角模型的统一形式。步态相仅根据大腿角度估算,这是一个稳定的变量,避免了相位漂移。基于卡尔曼滤波器的平滑液旨在进一步抑制估计步态阶段的突变。基于提出的步态相估计方法,基于步态阶段的关节角跟踪控制器是为跨股骨假体设计的。提出的步态估计方法,步态相和控制器通过在各种运动模式下的步行数据进行离线分析来评估。基于步态阶段的控制器的实时性能在经际假体的实验中得到了验证。
translated by 谷歌翻译
本文提出了一种新颖的统一特征优化(UFO)范式,用于训练和在现实世界和大规模场景下进行深层模型,这需要集合多个AI功能。不明飞行物的目标是通过对所有任务进行大规模预修。与众所周知的基础模型相比,UFO具有两个不同的重点,即相对较小的模型大小,没有适应性成本:1)UFO以多任务学习方式将广泛的任务挤入中等尺寸的统一模型中并在转移到下游任务时进一步修剪模型大小。 2)不明飞行物不强调转移到新任务。相反,它旨在使修剪模型专门用于一个或多个已经看到的任务。有了这两个特征,UFO为灵活的部署提供了极大的便利,同时保持了大规模预处理的好处。 UFO的一个关键优点是修剪过程不仅可以减少模型的大小和推理消耗,而且还提高了某些任务的准确性。具体而言,UFO考虑了多任务培训,并对统一模型产生了两倍的影响:一些密切相关的任务具有相互利益,而某些任务相互冲突。不明飞行物设法通过新颖的网络体系结构搜索(NAS)方法来减少冲突并保留相互利益。对各种深度表示学习任务(即面部识别,人重新识别,车辆重新识别和产品检索)的实验表明,从UFO中修剪的模型比单件任务训练的对应物更高,但却具有更高的准确性较小的型号大小,验证不明飞行物的概念。此外,UFO还支持发布170亿个参数计算机视觉(CV)基础模型,该模型是该行业中最大的CV模型。
translated by 谷歌翻译
在这项工作中,我们提出了叙述,这是一种新颖的管道,可以以逼真的方式同时编辑肖像照明和观点。作为一种混合神经形态的面部模型,叙述了几何学感知生成方法和正常辅助物理面部模型的互补益处。简而言之,叙述首先将输入肖像转变为粗糙的几何形状,并采用神经渲染来产生类似于输入的图像,并产生令人信服的姿势变化。但是,反演步骤引入了不匹配,带来了较少面部细节的低质量图像。因此,我们进一步估计了师范的肖像,以增强粗糙的几何形状,从而创建高保真的物理面部模型。特别是,我们融合了神经和身体渲染,以补偿不完善的反转,从而产生了现实和视图一致的新颖透视图像。在重新阶段,以前的作品着重于单一视图肖像重新审议,但也忽略了不同观点之间的一致性,引导不稳定和不一致的照明效果以进行视图变化。我们通过将其多视图输入正常地图与物理面部模型统一,以解决此问题。叙事通过一致的正常地图进行重新进行重新,施加了跨视图的约束并表现出稳定且连贯的照明效果。我们在实验上证明,叙述在先前的工作中取得了更现实的,可靠的结果。我们进一步使用动画和样式转移工具进行介绍,从而分别或组合姿势变化,灯光变化,面部动画和样式转移,所有这些都以摄影质量为单位。我们展示了生动的自由视图面部动画以及3D感知可靠的风格化,可帮助促进各种AR/VR应用程序,例如虚拟摄影,3D视频会议和后期制作。
translated by 谷歌翻译
在本文中,我们研究了在非全粒图上进行节点表示学习的自我监督学习的问题。现有的自我监督学习方法通​​常假定该图是同质的,其中链接的节点通常属于同一类或具有相似的特征。但是,这种同质性的假设在现实图表中并不总是正确的。我们通过为图神经网络开发脱钩的自我监督学习(DSSL)框架来解决这个问题。 DSSL模仿了节点的生成过程和语义结构的潜在变量建模的链接,该过程将不同邻域之间的不同基础语义解散到自我监督的节点学习过程中。我们的DSSL框架对编码器不可知,不需要预制的增强,因此对不同的图表灵活。为了通过潜在变量有效地优化框架,我们得出了自我监督目标的较低范围的证据,并开发了具有变异推理的可扩展培训算法。我们提供理论分析,以证明DSSL享有更好的下游性能。与竞争性的自我监督学习基线相比,对各种类图基准的广泛实验表明,我们提出的框架可以显着取得更好的性能。
translated by 谷歌翻译
这项工作研究了针对推荐系统的有偏见反馈中学习无偏算法的问题。我们从理论和算法的角度解决了这个问题。无偏学习的最新著作通过各种技术(例如元学习,知识蒸馏和信息瓶颈)推进了最新技术。尽管取得了经验成功,但大多数人缺乏理论保证,在理论和最近的算法之间形成了不可忽略的差距。为此,我们首先从分配转移的角度查看无偏见的推荐问题。我们理论上分析了公正学习的概括界限,并提出了它们与最近无偏学习目标的密切关系。基于理论分析,我们进一步提出了一个原则性的框架,对抗性自我训练(AST),以无偏见。对现实世界和半合成数据集的经验评估证明了拟议的AST的有效性。
translated by 谷歌翻译
非负矩阵分解(NMF)已广泛用于降低机器学习的尺寸。但是,传统的NMF无法正确处理异常值,因此对噪声敏感。为了提高NMF的鲁棒性,本文提出了一种自适应加权NMF,它引入了权重,以强调每个数据点的不同重要性,因此降低了对噪声数据的算法敏感性。它与使用缓慢生长相似性度量的现有强大NMF大不相同。具体而言,提出了两种实现这一目标的策略:模糊加权技术和熵加权技术,两者都导致具有简单形式的迭代解决方案。实验结果表明,新方法在具有噪声的几个真实数据集上具有更健壮的特征表示,而不是进行噪声。
translated by 谷歌翻译